Degree Correlations in Random Geometric Graphs 
Spatially embedded networks are important in several disciplines. The prototypical spatial network we assume is the Random Geometric Graph, of which many properties are known. Here we present new results for the two-point degree correlation function in terms of the clustering coefficient of the graphs, for two-dimensional space in particular, with extensions to arbitrary finite dimension.

PACS numbers: 02.10.Ox, 89.75.-k 

In the last decade, thanks to abundant data, new models, and adequate software tools, complex networks have been thoroughly investigated in many disciplines and as a substrate of many phenomena. A synthesis is now emerging, as can be seen e.g. in the recent comprehensive treatment by Newman [1]. Most of this work has dealt with "relational" networks, i.e. graphs in which distances do not have physical meaning and are just dimensionless quantities measured in terms of edge hops. Indeed, many networks, such as large social networks, are mainly of this kind. However, in many cases the physical space in which networks are embedded and the actual distances between nodes are important, as in rail and road networks, ad hoc communication device networks, and other geographical and transportation networks. The recent comprehensive review by Barthelemy [2] has at last put together a large amount of scattered material on spatial networks. The Random Geometric Graph (RGG) is a standard spatial network model that plays a role for spatial networks similar to the one played by the Erdős-Rényi random graph for relational ones. This model is well known [2-4], but some of its second-order features have not yet been uncovered. Among these is the question of the degree correlation functions. In this Communication we present some results on degree correlations in RGGs that we believe were previously unknown.

The construction process of an RGG with $N$ nodes and radius $R$ can be summarized as follows [3]:

• the $N$ nodes are placed in the unit space $\Omega \subset \mathbb{R}^d$ with uniform distribution;

• an edge is created for every pair of nodes whose distance is $r < R$.
In particular, the distance is given by some metric on $\Omega$.
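As an illustration of the two construction steps, the following Python sketch builds a small RGG with NumPy. It is not taken from the original work: the function name `random_geometric_graph`, its parameters, and the dense pairwise distance computation are assumptions chosen for brevity, and the `periodic` flag simply switches between a torus and a unit cube with boundaries.

```python
import numpy as np

def random_geometric_graph(n, radius, dim=2, periodic=True, seed=None):
    """Return (positions, adjacency) for an RGG in the unit cube [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n, dim))                        # step 1: uniform placement
    diff = np.abs(pos[:, None, :] - pos[None, :, :])  # pairwise coordinate gaps
    if periodic:
        diff = np.minimum(diff, 1.0 - diff)           # wrap-around (torus) gaps
    dist = np.sqrt((diff ** 2).sum(axis=-1))          # Euclidean metric
    adj = dist < radius                               # step 2: edge iff r < R
    np.fill_diagonal(adj, False)                      # no self-loops
    return pos, adj

pos, adj = random_geometric_graph(n=500, radius=0.1, seed=0)
print(adj.sum() // 2, "edges")
```

The dense distance matrix needs memory of order $N^2$, so this sketch is only meant for small graphs; a cell list or a k-d tree would be the usual choice for large $N$.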

In this work we deal with 2-dimensional RGGs and the Euclidean distance on $\mathbb{R}^2$. The unit space $\Omega$ is the square $[0, 1]^2$ without boundaries (i.e. wrapped into a torus). We call the neighborhood area $V_X$ of node $X$ the set of points that lie at a distance smaller than the radius $R$ from $X$. In Fig. 1(a) three nodes $X$, $Y$, $Z$ and their neighborhood areas are depicted. $X$ and $Y$ are connected by an edge since each lies within the neighborhood area of the other, while $Z$ is not connected to $Y$ even though their neighborhood areas overlap. Fig. 1(b) shows an RGG realization with $N = 100$ and $R = 0.13$; in this case, for illustrative purposes, we have kept the boundaries of $\Omega$ (no wrap-around).

FIG. 1: (a) Neighborhood area. (b) An example of RGG with $N = 100$ and $R = 0.13$ (with boundary conditions).

The average degree $\bar{k} = \bar{k}(N, R)$ of an RGG can be easily estimated by the formula $\bar{k} = \rho V$, where $\rho$ is the node density, i.e. the number of nodes per unit of space, and $V$ is the volume of the neighborhood area. In this case $\rho = N$, since $\Omega$ is a unit space, and $V = \pi R^2$, so that $\bar{k} = \pi N R^2$. According to this result, it is possible to consider $\bar{k}$, instead of the radius $R$, as a parameter of RGGs. Therefore, in order to construct an RGG with an average degree that tends to $\bar{k}$ as $N, 1/R \to \infty$, it is sufficient to use the radius $R = \sqrt{\bar{k}/(\pi N)}$.
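The relation between radius and average degree can be written as two small helper functions. This is a minimal sketch; the names `radius_for_mean_degree` and `expected_mean_degree` are hypothetical, and the example values $N = 50000$, $\bar{k} = 50$ are those used later for the empirical curves.

```python
import math

def radius_for_mean_degree(k_mean, n):
    """Radius R such that pi * N * R^2 equals the target mean degree (2D)."""
    return math.sqrt(k_mean / (math.pi * n))

def expected_mean_degree(n, radius):
    """Mean-field estimate k = rho * V, with rho = N and V = pi * R^2."""
    return math.pi * n * radius ** 2

R = radius_for_mean_degree(50, 50_000)
print(R, expected_mean_degree(50_000, R))
```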

The degree distribution of RGGs can be estimated by considering the probability that a node $X$ has degree $k$, given that there are $N$ other nodes uniformly distributed in $\Omega$. More precisely, there are $N - 1$ other nodes, but $N \approx N - 1$ for large values of $N$. This probability follows the binomial distribution and is equal to:

$$P(k) = \binom{N}{k} p^k (1 - p)^{N-k},$$

where $p = \pi R^2$, since it represents the ratio between $V$ and the area of $\Omega$. The Poisson distribution with parameter $\lambda = Np$ can be used as an approximation of the binomial distribution if $N$ is sufficiently large and $p$ is sufficiently small. In this case the degree distribution is approximated by:

$$P(k) \approx \frac{\lambda^k e^{-\lambda}}{k!},$$

where $\lambda = \bar{k}$.
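The quality of the Poisson approximation can be checked numerically. The sketch below compares the two probability mass functions pointwise; $N = 10000$ is the value quoted for Fig. 2, while the mean degree of 20 and the function names are assumptions made for this illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial probability, evaluated in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    """Poisson probability with parameter lam, also via logarithms."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

N, k_mean = 10_000, 20.0      # N as in Fig. 2; the mean degree is an assumed value
p = k_mean / N                # p = pi * R^2 = k_mean / N
max_gap = max(abs(binomial_pmf(k, N, p) - poisson_pmf(k, N * p))
              for k in range(81))
print(f"largest pointwise difference: {max_gap:.2e}")
```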

Fig. 2 shows the cumulative degree distributions of three RGGs with the same number of nodes and different average degrees.

FIG. 2: Cumulative degree distribution functions for three realizations of RGG with $N = 10000$ and three different average degrees.

The average clustering coefficient is obtained by averaging the individual clustering coefficients of all nodes [1], [4]. This property of RGGs was extensively studied by Dall and Christensen [4], who found the law for the average clustering coefficient as a function of the dimension of the space. Here the dimension is equal to 2, and it is possible to demonstrate that the average clustering coefficient $C_2$ tends to $1 - \frac{3\sqrt{3}}{4\pi} \approx 0.5865$ for large values of $N$ and for all 2-dimensional RGGs [3] in the Euclidean space. By analogy, we shall call $C_d$ the average clustering coefficient of a $d$-dimensional RGG.

This important result depends on the particular construction of RGGs. The average clustering coefficient tends to the ratio between the average shared neighborhood area of two connected nodes and the whole neighborhood area. Clearly, this fraction keeps the same value when the radius $R$ changes. This property, which leads to a fixed average clustering coefficient for every RGG, is studied in depth in the following in order to estimate the degree correlations. Many other properties of RGGs have been studied in Penrose's book [3].

Due to its construction process, a 2-dimensional RGG has positive degree-degree correlation. This property is commonly detected by studying the average degree of the neighbors of a given node of degree $k$ [1]. The function $k_{nn}(k)$ (nearest-neighbor average degree) represents the average degree of the neighbors of all nodes of degree $k$. The properties that emerge from the spatial construction of RGGs allow us to evaluate $k_{nn}(k)$ with a mean-field method for very large values of $N$.

It is possible to evaluate the average degree of a node's neighbors by estimating the average shared area of two connected nodes. Fig. 3 depicts the case of two connected nodes $X$ and $Y$, and their shared area $A$ (grey). $B$ is the complement of $A$ in the neighborhood area $V = \pi R^2$, and $r$ is the distance between the two nodes. The area $C$ is symmetrical and equivalent to $B$.

FIG. 3: Two connected nodes $X$, $Y$ and their neighborhood areas of radius $R$. The distance between them is $r$ and the angle is $\theta = \arccos\left(\frac{r}{2R}\right)$. The grey area $A$ is the shared area of the two connected nodes. $B$ and $C$ are the complementary areas of $A$ to $V$.



The shared area $A(r)$ of two connected nodes depends only on the distance $r$ between them. The formula for $A(r)$ can be simply derived from that of the circular sector and is equal to:

$$A(r) = R^2 (2\theta - \sin 2\theta), \quad \text{where } \theta = \arccos\left(\frac{r}{2R}\right).$$
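The formula $A(r) = R^2(2\theta - \sin 2\theta)$ with $\theta = \arccos(r/2R)$ is easy to sanity-check at its limiting values: coincident nodes share the whole disc, and the overlap vanishes at $r = 2R$. The sketch below does this in Python; the function name `shared_area` is hypothetical.

```python
import math

def shared_area(r, R):
    """Overlap area of two discs of radius R whose centres are r apart."""
    theta = math.acos(r / (2.0 * R))
    return R ** 2 * (2.0 * theta - math.sin(2.0 * theta))

R = 1.0
print(shared_area(0.0, R) / (math.pi * R ** 2))   # coincident centres: ratio 1
print(shared_area(R, R) / (math.pi * R ** 2))     # largest separation with an edge
print(shared_area(2.0 * R, R))                    # tangent discs: no overlap
```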

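Thus, the average shared area $\bar{A}$ is obtained by integrating and averaging $A(r)$ over all possible neighbors $Y$ of $X$ (Fig. 3). We now calculate this area:

$$\bar{A} = \frac{\int_0^{2\pi} \int_0^R R^2 (2\theta - \sin 2\theta)\, r \, dr \, d\phi}{\int_0^{2\pi} \int_0^R r \, dr \, d\phi},$$

where the variables $0 < r < R$ and $0 < \phi < 2\pi$ represent all possible neighbors $Y$ in the neighborhood area of $X$. The integral in the numerator is calculated by using the substitutions $r = 2R\cos\theta$ and $\theta = \psi/2$, which leads to:

$$\int_0^{2\pi} \int_0^R R^2 (2\theta - \sin 2\theta)\, r \, dr \, d\phi = 2\pi R^4 \int_{2\pi/3}^{\pi} (\psi \sin\psi - \sin^2\psi)\, d\psi = \pi R^4 \left(\pi - \frac{3\sqrt{3}}{4}\right).$$

It follows that the average shared area is:

$$\bar{A} = \left(\pi - \frac{3\sqrt{3}}{4}\right) R^2.$$

According to the last result, it is possible to evaluate the ratio of $\bar{A}$ and $V$, which leads to: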

$$\frac{\bar{A}}{V} = \frac{\bar{A}}{\pi R^2} = 1 - \frac{3\sqrt{3}}{4\pi} = A_2,$$

where $A_2 \approx 0.5865$ is the asymptotic value of the average clustering coefficient of 2-dimensional RGGs.
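The value $A_2 = 1 - 3\sqrt{3}/(4\pi)$ can be cross-checked by integrating the shared area numerically over the neighborhood disc. The following sketch uses a plain midpoint rule; the function names and the number of integration steps are assumptions of this illustration.

```python
import math

def shared_area(r, R):
    """Overlap area of two discs of radius R whose centres are r apart."""
    theta = math.acos(r / (2.0 * R))
    return R ** 2 * (2.0 * theta - math.sin(2.0 * theta))

def mean_shared_fraction(R=1.0, steps=100_000):
    """Midpoint-rule estimate of A_bar / V, weighting A(r) by 2*pi*r dr."""
    h = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += shared_area(r, R) * 2.0 * math.pi * r * h
    a_bar = total / (math.pi * R ** 2)      # average over the disc of radius R
    return a_bar / (math.pi * R ** 2)       # ratio to the neighbourhood area V

exact = 1.0 - 3.0 * math.sqrt(3.0) / (4.0 * math.pi)
print(mean_shared_fraction(), exact)
```

Calling `mean_shared_fraction` with a different `R` gives the same ratio, which is the scale invariance that makes the clustering coefficient independent of the radius.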
Finally, using the mean-field method to evaluate the shared area of neighbors, it is possible to find the expression for the function $k_{nn}(k)$. To better understand this result, it is useful to use again the notation of Fig. 3. Focusing on a node $X$ of known degree $k_X$, we want to evaluate the average degree of its neighbor $Y$. This degree is contributed by two different areas, in which the node density may differ.

The first area that brings neighbors to node $Y$ is the shared area $A$, which is approximated by $\bar{A}$ in the mean-field method, and where the node density is equal to $k_X / V$, because we know the degree of $X$.

The second area in which node $Y$ can find other neighbors is the complement $C$ of $A$. In particular, by approximating $A$ by $\bar{A}$ we also approximate $C$ by $\bar{C} = V - \bar{A}$. In $C$ the node density is equal to $\rho = N$, since we have no further information about this area.

Putting this information together, we can easily give the expression for $k_{nn}(k_X)$, which represents the degree of node $Y$:

$$k_{nn}(k_X) = \frac{k_X}{V}\bar{A} + \rho (V - \bar{A}) = A_2 k_X + (1 - A_2)\bar{k}.$$

But $k_X$ stands for the degree of a generic node in the graph, and this leads to the final expression of the function:

$$k_{nn}(k) = A_2 k + (1 - A_2)\bar{k}.$$

This function is linear in $k$ and reveals the positive relation between the degree of a node and the average degree of its neighbors ($A_2 > 0$).
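To make the linear law concrete, the mean-field prediction $k_{nn}(k) = A_2 k + (1 - A_2)\bar{k}$ can be tabulated for a few degrees. This is a minimal sketch: the function name `knn_mean_field` is hypothetical and $\bar{k} = 50$ is taken as an example value.

```python
import math

A2 = 1.0 - 3.0 * math.sqrt(3.0) / (4.0 * math.pi)   # about 0.5865

def knn_mean_field(k, k_mean, a=A2):
    """Mean-field average neighbour degree of a node of degree k."""
    return a * k + (1.0 - a) * k_mean

for k in (30, 50, 70):
    print(k, round(knn_mean_field(k, 50.0), 2))
```

A node whose degree equals the mean has neighbors of mean degree as well, while the prediction for other degrees is pulled towards $\bar{k}$ by the factor $1 - A_2$.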

The degree-degree correlation is not only estimated by $k_{nn}(k)$, but is also given exactly by the assortativity coefficient, which is the Pearson correlation coefficient of the degrees of pairs of connected nodes. This value is widely used as a measure of the strength of linear dependence between two variables [1]. As for the average clustering coefficient, we use the notation $r_2$ to indicate the assortativity coefficient of a 2-dimensional RGG. As we have seen above, the function $k_{nn}(k)$ is linear and, applying the mean-field method, it can represent the regression line for the two variables, degree ($X$) and neighbor degree ($Y$). We thus assume that $k_{nn}(k)$ is the regression line and we derive $r_2$ from it.

The regression line slope $b$ tends to $A_2$ for large values of $N$ and is defined by the following formula:

$$b = \frac{\mathrm{cov}(X, Y)}{\sigma_X^2} \to A_2, \quad N \to \infty,$$

where $\sigma_X$ is the standard deviation of variable $X$ and $\mathrm{cov}(X, Y)$ is the covariance of the two variables $X$, $Y$.

On the other hand, $r_2$ is by definition the covariance of the two variables $X$, $Y$ divided by the product of their standard deviations:

$$r_2 = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}.$$

It follows that:

$$r_2 \to \frac{\sigma_X}{\sigma_Y} A_2,$$

where $\sigma_X$, $\sigma_Y$ are the standard deviations of variables $X$ and $Y$, respectively.
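The step from the regression slope to the correlation coefficient relies on the identity $r = (\sigma_X / \sigma_Y)\,\mathrm{cov}(X, Y)/\sigma_X^2$. The sketch below verifies it on synthetic data with NumPy; the data-generating relation (slope 0.6, Gaussian noise) is purely hypothetical and is not a model of an RGG.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(50, size=10_000).astype(float)
y = 0.6 * x + rng.normal(0.0, 5.0, size=x.size)   # hypothetical linear relation

cov_xy = np.cov(x, y, ddof=0)[0, 1]
b = cov_xy / x.var()                  # regression slope of y on x
r = cov_xy / (x.std() * y.std())      # Pearson correlation coefficient

print(np.isclose(r, b * x.std() / y.std()))
print(np.isclose(r, np.corrcoef(x, y)[0, 1]))
```

When the two standard deviations coincide, the slope and the correlation coefficient are the same number, which is the situation argued for RGGs in the text.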

In order to estimate these two standard deviations, we focus on their distribution functions $f_X(x)$ and $f_Y(y)$. We already know that $f_X(x)$ is a Poisson distribution, since it represents the degree distribution of an RGG. This implies that $\sigma_X^2 = \bar{k}$.

We can in turn find the expression of the distribution $f_Y(y)$, which is the distribution of neighbor degrees, using the relation:

$$f_Y(y) = \frac{f_Y(y \mid x = x^*)}{f_X(x \mid y = y^*)} f_X(x).$$

The numerator function $f_Y(y \mid x = x^*)$ is the degree distribution of the neighbors of a given node of degree $x^*$, and has an expected value equal to $k_{nn}(x^*)$.

The other function $f_X(x \mid y = y^*)$ represents the opposite case: it is the probability distribution of the degrees of nodes, given that they have a neighbor of degree $y^*$.

Since the two functions are completely equivalent because of the symmetry of their definitions, we can conclude that $f_Y(y) = f_X(x)$ and, consequently, $\sigma_X = \sigma_Y$.

We can then conclude that $r_2 \to A_2$ for large values of $N$.
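The limit $r_2 \to A_2$ can be probed with a small simulation. The sketch below is only an illustration under stated assumptions: it uses $N = 2000$ and $\bar{k} = 10$ (far smaller than the graphs discussed here), a dense distance matrix, and hypothetical function names, so the estimate fluctuates noticeably around $A_2 \approx 0.5865$ from one realization to the next.

```python
import numpy as np

def rgg_adjacency(n, k_mean, seed=0):
    """Boolean adjacency matrix of a 2D RGG on the unit torus."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n, 2))
    radius = np.sqrt(k_mean / (np.pi * n))        # R = sqrt(k / (pi N))
    d2 = np.zeros((n, n))
    for c in range(2):                            # accumulate squared gaps per axis
        gap = np.abs(pos[:, None, c] - pos[None, :, c])
        gap = np.minimum(gap, 1.0 - gap)          # wrap-around distance
        d2 += gap ** 2
    adj = d2 < radius ** 2
    np.fill_diagonal(adj, False)
    return adj

def assortativity(adj):
    """Pearson correlation of degrees over both orientations of every edge."""
    deg = adj.sum(axis=1)
    i, j = np.nonzero(adj)                        # each edge appears twice
    return np.corrcoef(deg[i], deg[j])[0, 1]

adj = rgg_adjacency(n=2000, k_mean=10.0)
a2 = 1.0 - 3.0 * np.sqrt(3.0) / (4.0 * np.pi)
print(f"r_2 = {assortativity(adj):.3f}, A_2 = {a2:.3f}")
```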

In Fig. 4 we depict $k_{nn}(k)$, theoretical and empirical, in an RGG with $N = 50000$ and $\bar{k} = 50$. The figure shows very good agreement between the empirical curve and the theoretical result.

FIG. 4: Straight thick line: theoretical $k_{nn}(k)$ with a slope coefficient of 0.586. Dotted curve: empirical $k_{nn}(k)$. $N = 50000$, $\bar{k} = 50$. Due to the peaked empirical degree distribution function (see inset), low and high $k$ values are noisier.

This last result can be extended to other kinds of RGGs, with different dimensions or neighborhood volumes (for $d > 3$), since it does not depend on the shape of the neighborhood volume, but only on the ratio of the average shared volume and the neighborhood volume. For RGGs this ratio is intrinsically represented by the average clustering coefficient of the graph. The individual clustering coefficient $C_X$ [4] of a node $X$ of degree $k_X$ is given by the following definition:

$$C_X = \frac{\#\{\text{triangles containing } X\}}{k_X (k_X - 1)/2}.$$

Let us define

$$T_{XY} = \#\{\text{triangles with edge } \{X, Y\}\},$$

which represents the number of all triangles formed with the edge $\{X, Y\}$ in the graph. It follows that:

$$C_X = \frac{\sum_{Y} T_{XY}}{k_X (k_X - 1)}, \qquad (1)$$

where the sum runs over the neighbors $Y$ of $X$ and each triangle containing $X$ is counted twice.

Let us denote by $A_d$ the ratio of the average shared volume of two connected nodes and the neighborhood volume of $d$-dimensional RGGs. We find that, on average, $T_{XY} \to (k_X - 1) A_d$ and, substituting in equation (1), $C_X \to A_d$.

Thus, the average clustering coefficient $C_d = \frac{1}{N} \sum_X C_X$ of $d$-dimensional RGGs tends to $A_d$.
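The edge-based form of the clustering coefficient, in which the triangle counts $T_{XY}$ are summed over the neighbors $Y$ of $X$ and divided by $k_X(k_X - 1)$, can be checked on a toy graph. In the sketch below the function name and the four-node example are assumptions of this illustration; $T_{XY}$ is computed as the number of common neighbors of $X$ and $Y$.

```python
import numpy as np

def clustering_from_edge_triangles(adj):
    """C_X = sum_Y T_XY / (k_X (k_X - 1)), T_XY = common neighbours of X and Y."""
    a = adj.astype(float)
    deg = a.sum(axis=1)
    t_xy = (a @ a) * a                  # T_XY on edges, zero elsewhere
    pairs = deg * (deg - 1.0)
    c = np.zeros_like(deg)              # nodes of degree < 2 keep C_X = 0
    np.divide(t_xy.sum(axis=1), pairs, out=c, where=pairs > 0)
    return c

# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = np.zeros((4, 4), dtype=bool)
for u, v in [(0, 1), (0, 2), (1, 2), (0, 3)]:
    adj[u, v] = adj[v, u] = True

print(clustering_from_edge_triangles(adj))   # node 0: 1/3, nodes 1 and 2: 1, node 3: 0
```

Node 0 has three neighbors and one triangle, so one of its three neighbor pairs is connected; the edge-based sum counts that triangle twice and divides by $3 \cdot 2$, giving the same value.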
Now, considering the formula given by Dall and Christensen [4], which estimates $C_d$ in the Euclidean space, we can conclude that it represents a good approximation of $A_d$ for large values of $N$ and $d$, while, for small values of $d$, this function overestimates $A_d$.

The constant $A_d$ depends on the shape of the neighborhood volume and represents the asymptotic value for the average clustering coefficient $C_d$ and the assortativity coefficient $r_d$. The last assertion is due to the fact that, as we have seen above in the case $d = 2$, $r_2$ tends to the fraction $A_2$. The same procedure is applicable to any dimension $d$ in order to evaluate $r_d$.

Similar analytical results can be obtained in the same way for higher orders of degree correlation, but the calculations become particularly heavy. For example, the correlation coefficient between a given node's degree and the degree of its neighbors at distance 2 can be obtained from the study of the function $k_{nn}^{(2)}(k)$, which represents the average degree of neighbors at distance 2. Here the distance is intended to be the relational distance in the graph, i.e. the number of edges in the shortest path that connects the two nodes. From Figs. 4 and 5 one sees that the degree correlations are non-negligible up to graph distance 2, but they tend to disappear with increasing distance.
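The decay of the correlation with graph distance can also be observed empirically. The following sketch, under the same assumptions as a small dense simulation ($N = 2000$, $\bar{k} = 10$, hypothetical function names), compares the degree correlation of pairs at graph distance 1 with that of pairs at graph distance exactly 2; the numerical values depend on the realization and on the chosen parameters.

```python
import numpy as np

def rgg_adjacency(n, k_mean, seed=0):
    """Boolean adjacency matrix of a 2D RGG on the unit torus."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n, 2))
    radius = np.sqrt(k_mean / (np.pi * n))
    d2 = np.zeros((n, n))
    for c in range(2):
        gap = np.abs(pos[:, None, c] - pos[None, :, c])
        gap = np.minimum(gap, 1.0 - gap)          # wrap-around distance
        d2 += gap ** 2
    adj = d2 < radius ** 2
    np.fill_diagonal(adj, False)
    return adj

def degree_correlation(pair_mask, deg):
    """Pearson correlation of degrees over all ordered pairs in pair_mask."""
    i, j = np.nonzero(pair_mask)
    return np.corrcoef(deg[i], deg[j])[0, 1]

adj = rgg_adjacency(n=2000, k_mean=10.0)
deg = adj.sum(axis=1)
a = adj.astype(float)
two_step = (a @ a) > 0                  # a path of length 2 exists
np.fill_diagonal(two_step, False)
dist2 = two_step & ~adj                 # graph distance exactly 2

r1 = degree_correlation(adj, deg)
r2 = degree_correlation(dist2, deg)
print(f"distance 1: {r1:.3f}, distance 2: {r2:.3f}")
```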

In summary, we have presented new results for the degree-degree correlations in RGGs in terms of their average 
clustering coefficients, showing exact results for the two-dimensional case and extending them to arbitrary finite 
dimension. 



[1] M. E. J. Newman. Networks: An Introduction. Oxford University Press, Oxford, UK, 2010. 

[2] M. Barthelemy. Spatial networks. Physics Reports, 499:1-101, 2011. 

[3] M. Penrose. Random Geometric Graphs. Oxford University Press, Oxford, UK, 2003. 

[4] J. Dall and M. Christensen. Random geometric graphs. Phys. Rev. E, 66:016121, 2002. 




FIG. 5: Dotted curve: empirical $k_{nn}^{(2)}(k)$. $N = 50000$, $\bar{k} \approx 50$. Straight thick line: linear regression line with slope coefficient of 0.255.